Data Days

What is R?

An OPEN-SOURCE programming LANGUAGE and free software ENVIRONMENT for STATISTICAL COMPUTING and GRAPHICS

Learn more at https://www.r-project.org/

Download R at https://cran.r-project.org/

The R Interface

R uses a command line interface.

How does R work?

Basic Usage
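
As a minimal sketch of what basic usage looks like at the R prompt, the lines below assign a vector, call a few built-in functions, and define a small function. The names x and square are hypothetical and chosen only for illustration; mtcars is a data set that ships with R.

```r
# Assign values with <- ; typing an expression prints its result
x <- c(2, 4, 6, 8)       # a numeric vector
mean(x)                  # arithmetic mean of the vector
x * 2                    # arithmetic is vectorised, element by element

# Built-in data sets such as mtcars are data frames
str(mtcars[, c("mpg", "cyl")])
summary(mtcars$mpg)

# Define and call a function (square is an illustrative name)
square <- function(v) v^2
square(x)
```

Each line can be typed directly at the console or run from a script.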

"R doesn’t protect you from yourself: you can easily shoot yourself in the foot. As long as you don’t aim the gun at your foot and pull the trigger, you won’t have a problem."

- Hadley Wickham (Advanced R; 2014)

Hadley Wickham https://github.com/hadley

RStudio https://www.rstudio.com/

The RStudio IDE

Essential Packages: dplyr

Grammar for Data Manipulation

library(dplyr)
mtcars %>% 
    filter(mpg >= 15) %>% 
    group_by(cyl) %>% 
    summarise(numCARS = n(),
              avgMPG = mean(mpg),
              avgHP = mean(hp),
              medWT = median(wt),
              pctMANUAL = mean(am)) %>% 
    arrange(cyl)
# A tibble: 3 × 6
    cyl numCARS   avgMPG     avgHP medWT pctMANUAL
  <dbl>   <int>    <dbl>     <dbl> <dbl>     <dbl>
1     4      11 26.66364  82.63636 2.200 0.7272727
2     6       7 19.74286 122.28571 3.215 0.4285714
3     8       9 16.47778 198.77778 3.570 0.2222222

https://github.com/hadley/dplyr

Essential Packages: ggplot2

Publication Quality Graphics

library(ggplot2)
ggplot(data = mtcars, aes(x = hp, y = mpg)) + 
    geom_point() + 
    stat_smooth(method = lm)

https://github.com/hadley/ggplot2

Essential Packages: rmarkdown

Dynamic Documents, Presentations and Reports

  • Combine markdown with R code/output
  • Fully reproducible output
  • Many output formats (HTML, PDF, etc.)
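
To illustrate the points above, here is a rough sketch that writes a tiny R Markdown file and renders it to HTML. The file contents, the report title, and the variable names (rmd_file, out_file, fence) are assumptions made for this example; rendering also assumes the rmarkdown package and pandoc are installed, so the call is guarded.

```r
# Three backticks, built programmatically so the chunk markers are easy to read
fence <- strrep("`", 3)

# Write a small R Markdown document to a temporary file (illustrative content)
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example Report\"",
  "output: html_document",
  "---",
  "",
  "The average mpg in mtcars is `r round(mean(mtcars$mpg), 1)`.",
  "",
  paste0(fence, "{r}"),
  "summary(mtcars$mpg)",
  fence
), rmd_file)

# Render only if the rmarkdown package and pandoc are both available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(out_file)   # path of the rendered HTML file
}
```

Changing the output field in the header (for example to pdf_document) selects a different output format, provided the required tooling is installed.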

Reference Guide

Cheat Sheet

https://github.com/rstudio/rmarkdown

Essential Packages: leaflet

Interactive HTML Maps

library(leaflet)
leaflet() %>% 
    addTiles() %>% 
    setView(lng = -81.6925,
            lat = 41.50132,
            zoom = 17) %>% 
    addMarkers(lng = -81.695174,
               lat = 41.501313,
               popup = paste0("<b>HIMSS Innovation Center</b>",
                              "<br>4th floor of the Global Center for Health Innovation",
                              "<br>1 St Clair Ave NE",
                              "<br>Cleveland, OH 44114"))

https://rstudio.github.io/leaflet/

Essential Packages: DT

Interactive HTML Tables

library(DT)
datatable(iris,
          extensions = "Scroller",
          options = list(
              scrollY = 320,
              scrollCollapse = TRUE
          ),
          rownames = FALSE)

http://rstudio.github.io/DT/

Essential Packages: dygraphs

Interactive HTML Time Series Plots

library(dygraphs)
lungDeaths <- cbind(mdeaths, fdeaths)
dygraph(lungDeaths) %>% 
    dyLegend(width = 300,
             show = "always") %>%
    dyRangeSelector(dateWindow = c("1974-01-01",
                                   "1979-12-31"))

http://rstudio.github.io/dygraphs/

Essential Packages: RODBC

ODBC Database Access

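library(RODBC)
# Open a connection to the ODBC data source named "Adhoc"
ch <- odbcConnect("Adhoc")
# Run a query; the result comes back as a data frame
x <- sqlQuery(ch, "select * from dbo.MyTable;")
# Close the connection when finished
close(ch)
head(x)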
  id    random1    random2
1  1 0.27246300  0.9990163
2  2 0.48424002  0.7186590
3  3 0.63359538 -0.4202009
4  4 0.00975549  1.4590855
5  5 0.75757926 -0.4867805
6  6 0.42360277  0.5008907

RODBC Documentation

http://www.unixodbc.org/

Essential Packages: XLConnect

Book1.xlsx

Excel Connector for R

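library(XLConnect)
# Read rows 2-3, columns 2-3 of Sheet1 (cells B2:C3), with no header row
x <- readWorksheetFromFile(file = "../xlsx/Book1.xlsx", 
                           sheet = "Sheet1", 
                           startRow = 2, 
                           startCol = 2, 
                           endRow = 3, 
                           endCol = 3, 
                           header = FALSE)
# Add 4 to every value that was read
y <- x + 4
# Reopen the workbook and apply no cell styling when writing
wb <- loadWorkbook("../xlsx/Book1.xlsx")
setStyleAction(object = wb, 
               type = XLC$"STYLE_ACTION.NONE")
# Write the result back to Sheet1, starting at row 5, column 4 (cell D5)
writeWorksheet(object = wb,
               data = y,
               sheet = "Sheet1",
               startRow = 5,
               startCol = 4,
               header = FALSE)
# Save the changes to the same file
saveWorkbook(wb)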

XLConnect Documentation

Thank You!